The AI Agent Trust Gap: Bridging Risk to Reliability | Elastic’s Philipp Krenn

Update: 2025-07-16

Description

The age of ubiquitous AI agents is here, bringing immense potential and unprecedented risk.

Hosts Conor Bronsdon and Vikram Chatterji open the episode by discussing the urgent need to build trust and reliability into next-generation AI agents. Vikram unveils Galileo's free AI reliability platform for agents, featuring Luna 2 SLMs for real-time guardrails and its Insights Engine for automatic failure mode analysis. The platform enables cost-effective, low-latency evaluations in production and changes how teams debug agents. Achieving trustworthy AI agents demands rigorous testing, continuous feedback, and robust guardrailing, all complex challenges that call for powerful solutions from partners like Elastic.

Conor welcomes Philipp Krenn, Director of Developer Relations at Elastic, to discuss their collaboration in ensuring AI agent reliability, including how Elastic leverages Galileo's platform for evaluation. Philipp details Elastic's evolution from a search powerhouse to a key AI enabler, transforming data access with Retrieval-Augmented Generation (RAG) and new interaction modes. He discusses Elastic's investment in SLMs for efficient re-ranking and embeddings, emphasizing robust evaluation and observability for production. This collaborative effort aims to equip developers to build reliable, high-performing AI systems for every enterprise.

Chapters:

00:00 Introduction

01:09 Galileo's AI Reliability Platform

01:43 Challenges in AI Agent Reliability

06:17 Insights Engine and Its Importance

11:00 Luna 2: Small Language Models

14:42 Custom Metrics and Agent Leaderboard

19:16 Galileo's Integrations and Partnerships

21:04 Philipp Krenn from Elastic

24:47 Optimizing LLM Responses

25:41 Galileo and Elastic: A Powerful Partnership

28:20 Challenges in AI Production and Trust

30:02 Guardrails and Reliability in AI Systems

32:17 The Future of AI in Customer Interaction

Follow the hosts

Follow Atin

Follow Conor

Follow Vikram

Follow Yash

Follow Today's Guest(s)

Connect with Philipp on LinkedIn

Learn more about Elastic

Check out Galileo

Try Galileo

Agent Leaderboard

